Abstract

Data aggregators collect large amounts of information about
individual users and create detailed online behavioral profiles of
individuals. Behavioral profiles benefit users by improving products
and services. However, they have also raised concerns regarding
user privacy, transparency of collection practices and accuracy
of data in the profiles. To improve transparency, some companies
are allowing users to access their behavioral profiles. In this
work, we investigated behavioral profiles of users by utilizing
these access mechanisms. Using in-person interviews (n=8), we
analyzed the data shown in the profiles, elicited user concerns,
and estimated accuracy of profiles. We confirmed our interview
findings via an online survey (n=100). To assess the claim of
improving transparency, we compared data shown in profiles with
the data that companies have about users. More than 70% of
the participants expressed concerns about collection of sensitive
data such as credit and health information, level of detail and how
their data may be used. We found a large gap between the data
shown in profiles and the data possessed by companies. A large
number of profiles were inaccurate, with as much as 80% inaccuracy.
We discuss implications for public policy management.

1 Introduction 

The online services landscape is driven by a data economy in
which data aggregators and service providers trade user information
on data marketplaces. As part of the data economy,
data aggregators and service providers collect extensive amounts
of data about individuals from multiple sources, including public,
online and offline sources. By combining information
from multiple sources, they create behavioral profiles of individuals.
The data and profiles may be used for purposes such
as personalization, risk mitigation products, people search and
targeted advertising. The data economy benefits users by
providing better products and services. It also sustains many free
services such as search and social networking. However, the data
economy also raises privacy concerns. For example, studies have
found that users have privacy concerns when behavioral profiles
are used for advertising [26] [18] [24] [2]. Using profile data for
risk mitigation services such as background checks raises concerns
about accuracy of data.

In this work, we investigated online behavioral profiles of
users. To study behavioral profiles, we used access mechanisms
provided by companies that allow users to look at data in their
profiles. Companies have started providing access to
improve transparency of their data collection practices. First, we
analyzed the types of data found in actual profiles. A prior report
by the Federal Trade Commission investigated the types of
data that companies may potentially use to build behavioral profiles.
However, it did not look at contents of actual profiles.
We compared the data shown in profiles with the data companies
claim to possess about users. This allowed us to evaluate the
claim of increasing transparency by providing access to profiles.

Second, we studied user concerns and surprises regarding data
in their behavioral profiles. Prior studies have focused on user
concerns and perceptions regarding use of behavioral profiles for
advertising. We focus on user privacy concerns regarding
actual contents of behavioral profiles. Our approach of using
a user’s own behavioral profile for eliciting concerns and surprises
leads to a more contextualized and nuanced understanding
of user concerns regarding online behavioral profiles. Further,
we were able to estimate the accuracy of data in user profiles.
Our study also provides insight into usability of access mechanisms.

To understand contents and concerns of behavioral profiles,
we first conducted semi-structured interviews in which we asked
our participants (n=8) to look at their own profiles. We elicited
their surprises and concerns regarding the data in their profiles,
and documented the types of data found in their profiles.
Subsequently, we conducted an online survey (n=100) to confirm the
identified user concerns with a more diverse audience. We
investigated the types of data that companies possess about users by
surveying documents published by data aggregators and service
providers.

We organize the rest of the paper as follows. In Section 2, we
provide background information on the data economy and access
mechanisms provided by companies. We discuss related work in
Section 3. In Section 4, we provide an overview of our study
including interview details. In Section 5, we present our findings
regarding contents of actual behavioral profiles and contrast them
with data possessed by companies. In Section 6, we discuss user
concerns and surprises. In Section 7, we provide details of our
online survey and its results. Lastly, in Section 9, we discuss the
insights gained from our study and conclude with a discussion of
implications for public policy and future research.


2 Background 

To provide necessary background for the rest of the paper, we 
briefly discuss the data economy and access mechanisms. 



Figure 1: A conceptual model of the data economy (arrow labels: user data; products and services; user profiles)


The Data Economy: In Fig. 1, we show a simple conceptual
model of the data economy that highlights the role of data
aggregators. Users provide their personal data to public and private
sector service providers when they receive products and services
from these service providers. We group all entities such as
websites, offline stores, advertisers and marketers under the umbrella
of private sector service providers. Data aggregators collect
different types of user data available from service providers and
also via direct engagement with users. Public sources of
information include census data, voter registration databases,
occupation data from state license boards, bankruptcy records, county
deed and tax assessor records, and Yellow-pages directories.
Private sources of information include offline and online surveys,
in-store and online transactions, website and forum interactions,
and social networking activity. Data aggregators combine the
data obtained from these sources and build behavioral profiles of
individual users. The data and behavioral profiles are traded on
data marketplaces. Service providers can purchase data and
profiles, and use them to enrich their knowledge about their
customers, which may help them to improve their services.

Accessing Online Behavioral Profiles: To increase transparency,
some companies allow users to access their behavioral
profiles. Companies may choose to show only some of the data
that they have collected about the user. In addition to looking
at their data, users may be able to edit data in their profiles.
Companies may use client-side or server-side validation to provide
access to user profiles. For example, BlueKai, Google and
Yahoo provide access to profiles based on browser cookies.
Companies such as Acxiom and Microsoft [19] require that
users create an account with them and sign in to access their
profiles. To create an account, users may have to provide their email
address and name. Additionally, companies may request
information such as full legal name, full address (street, city, state
and Zip code), date of birth and last four digits of social security
number to verify the identity of a user. In Fig. 2, we
show examples of the three profiles, BlueKai Registry, Google
Ad Settings, and Yahoo Ad Interests, used in our study.


3 Related Work 

The Federal Trade Commission recently released a comprehensive
report on the activities of data aggregators (brokers).
The report details how data is acquired using various data sources
and collection techniques. It discusses the types of data
collected, potential uses, and steps taken by aggregators to maintain
data accuracy. The report does not investigate contents of actual
profiles.


Several studies have looked into how advertisers use different
technical measures such as browser cookies, flash cookies and
JavaScript to collect different types of data and track
user activities. These studies have largely focused on
data collection from online sources and not on data collection
from public and offline sources.

Prior research has studied user understanding, perceptions and
concerns regarding targeted advertising, which uses behavioral
profiles to personalize ads. Turow et al. surveyed Americans’
attitudes towards targeted advertising that used data collected from
online websites and offline stores [25]. They used telephone
interviews and closed-ended questions to understand attitudes of a
representative sample of the US adult population. McDonald and
Cranor studied users’ understanding of targeted advertising and
technical mechanisms such as cookies used for targeted advertising,
and user concerns regarding targeted advertising [18]. Ur et
al. studied user beliefs, attitudes and concerns regarding targeted
advertising using semi-structured interviews [26]. Agarwal et
al. studied users’ concerns regarding embarrassing and suggestive
ads that may arise out of targeted advertising. Gomez et al.
studied user concerns regarding advertiser data practices by looking
at three sources of information: consumer complaints to the
FTC and other organizations, results from user surveys regarding
privacy, and published news articles. Kelley et al. studied
concerns about location-based advertising and identified different
factors that influence users’ level of concern. These studies
have not investigated privacy concerns regarding actual contents
of behavioral profiles, and they have not employed users’
own behavioral profiles.


4 Methodology 

In this section, we provide an overview of our study. We explain 
our choice of behavioral profiles. Lastly, we discuss details of 
in-person interviews we conducted. 

4.1 Overview of the Study 

We first conducted in-person interviews where we asked our
participants to access their own profiles. From the interviews, we
gathered and categorized the information observed in the behavioral
profiles of our participants. During the interviews, we also
elicited participants’ concerns and surprises regarding information
in their profiles. We discuss details of the in-person interviews
below.

After conducting in-person interviews, we conducted an online
survey. We designed the survey to achieve two goals. First,
we wanted to confirm whether a more diverse population of users
agreed with the concerns that we had identified from the interviews.
Second, we wanted to identify additional user concerns.
We defer discussion of the online survey until we have discussed
our findings from the in-person interviews. Details of the online
survey are in Section 7.

4.2 Selection of Behavioral Profiles 

We studied behavioral profiles from three companies: BlueKai
Registry, Google Ad Settings, and Yahoo Ad Interests (see
Fig. 2). As discussed in Section 2, these are cookie-based profiles
that do not require users to create accounts or sign in on data
aggregator websites. We felt participants would find cookie-based
profiles easier to access.

Figure 2: Sample profiles: BlueKai Registry (left), Google Ad Settings (middle), and Yahoo Ad Interests (right)

We chose profiles that covered a large number of users and
contained data from multiple sources. By doing so, we expect our
results to be more representative. Data in the BlueKai Registry
profiles come from nearly 30 third-party companies that
participate in the BlueKai Audience Data Marketplace. Currently
the marketplace is the world’s largest third-party data marketplace,
providing data on 300 million users or approximately 80%
of the US population. Google Ad Settings displays interests and
other information inferred from user activities on Google and
more than one million partner sites. Yahoo Ad Interests
shows data inferred from Yahoo’s sites and services.

4.3 In-person Interviews 

We conducted semi-structured in-person interviews with eight
participants. We explained to the participants that companies
may collect data about them, and may create behavioral profiles.
We informed the participants that they might be able to
access their profiles. We asked them to look at their profiles
(BlueKai, Google and/or Yahoo) and, if they felt comfortable,
to share information in their profiles with us. We asked them
to voice any concerns, surprises or questions regarding the data
in their profiles.

Participant Background: Our participant pool included graduate
students with engineering and/or science background. Only
one participant was aware that he could access profiles created
by companies. Six participants had never deleted cookies from
their browser, one deleted cookies selectively, and one regularly
deleted cookies.

Data Collection: From our eight participants, we collected
information on eight profiles including five BlueKai Registry
profiles, two Google Ad Settings profiles, and one Yahoo Ad
Interests profile. All of the participants tried to access their BlueKai
profile. Six were able to access their BlueKai profile, but two
were not able to access it as they were using cookie blocking
and/or script blocking. Of the six participants who accessed their
BlueKai profile, five shared profile information with us. Three of
our participants looked at an additional profile; two looked at
their Google profile and one looked at his Yahoo profile, and they
shared profile information with us.


Data Collection Challenges: Seven of our participants
showed us contents from their profiles. Six of them pointed
out specific items from their profiles that concerned or surprised
them, but did not share the entire profile with us. The primary
reason for this was the time it took to share the entire profile.
Due to the way the profiles were displayed, it was not possible
to download the entire contents of a profile into a spreadsheet or
XML document. To share information, a participant had to take
screenshots of each page in the profile. When there were multiple
pages, for example, 99 pages in the sample BlueKai profile
in Fig. 2, the participant had to click through the pages. Further,
individual entries in a page in a BlueKai profile were images and
not text, and, hence, it was not possible to copy and paste entries
into a text document. Since we studied cookie-based profiles,
participants were accessing profiles from their work or personal
computers, and most did not want us to access them in their absence.
One participant, who had an extensive profile, provided us access
to his computer. It took us over an hour to copy the entire profile.
This is one of the usability issues in accessing profiles.

5 Contents of Behavioral Profiles 

We discuss the types of data found in behavioral profiles of our
interview participants. Further, based on the information possessed
by companies that feed data into BlueKai profiles, we discuss
what other types of data may be found in behavioral profiles.

5.1 Analysis of Profiles 

We analyzed the eight profiles from our interview participants
for different types of data. We had profiles from three different
companies: BlueKai, Google and Yahoo. We computed the size, or
the total number of data items, of each profile. The two Google
profiles had ~120 items, the Yahoo profile had ~25 items, one
BlueKai profile had ~10 items, two BlueKai profiles had ~30
items, and two BlueKai profiles had ~570 items. Based on the
number of items, we can say that we had four small-sized profiles,
two medium-sized profiles and two large-sized profiles.
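The size grouping above can be reproduced with a short sketch. The item counts are the approximate values reported in this section; the profile labels and the cut-offs of 100 and 500 items are illustrative assumptions, since the study does not state exact thresholds.

```python
from collections import Counter

# Approximate item counts reported in the text, keyed by hypothetical labels.
PROFILE_SIZES = {
    "google_1": 120, "google_2": 120,
    "yahoo_1": 25,
    "bluekai_1": 10, "bluekai_2": 30, "bluekai_3": 30,
    "bluekai_4": 570, "bluekai_5": 570,
}

# Assumed cut-offs (not stated in the study): fewer than 100 items is
# "small", fewer than 500 is "medium", anything larger is "large".
SMALL_MAX = 100
MEDIUM_MAX = 500


def size_class(n_items):
    """Map an item count to a coarse size label."""
    if n_items < SMALL_MAX:
        return "small"
    if n_items < MEDIUM_MAX:
        return "medium"
    return "large"


def size_distribution(sizes):
    """Count how many profiles fall into each size class."""
    return Counter(size_class(n) for n in sizes.values())
```

Under these assumed cut-offs, the eight profiles split into four small, two medium and two large profiles, matching the grouping described above.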

We organized the data from these profiles into seven categories:
demographic, geographic, technical, predictive, psychographic,
behavior and life event. We based our categories on the
categories commonly used in data marketplaces and privacy policies
to describe different types of data. We tried to choose distinct,
non-overlapping categories, so that each data item would
fall into only one category. However, for some data items, it
was difficult to choose a category. For example, it was difficult
to decide whether “Credit Card Holder” belonged to individual
demographic or behavioral data category.
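The categorization difficulty can be illustrated with a small sketch. The seven category names come from the text; the mapping of sample items to candidate categories is a hypothetical illustration, not the coding scheme actually used in the study.

```python
# The seven categories named in the text.
CATEGORIES = (
    "demographic", "geographic", "technical", "predictive",
    "psychographic", "behavior", "life event",
)

# Hypothetical mapping from a profile item to every category it could
# plausibly belong to. An item with more than one candidate is ambiguous.
CANDIDATE_CATEGORIES = {
    "Female": {"demographic"},
    "US > Pennsylvania > Pittsburgh": {"geographic"},
    "Browser - IE 10": {"technical"},
    "In-Market - Cell-Phones and Plans": {"predictive"},
    "Buy American - Not Likely": {"psychographic"},
    "Green Living": {"behavior"},
    "Empty Nesters": {"life event"},
    # The text reports that this item fits two categories.
    "Credit Card Holder": {"demographic", "behavior"},
}


def ambiguous_items(candidates):
    """Return items that could be assigned to more than one category."""
    return sorted(item for item, cats in candidates.items() if len(cats) > 1)
```

With this sample mapping, only “Credit Card Holder” is flagged as ambiguous; a coder would have to settle on one category for it by convention.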

A challenge during analysis was to comprehend profile items.
For example, the meaning of “Demographic > High Confidence”
and “Credit Card Interest Score” was not clear. Does “High
Confidence” imply that the user has high confidence or that the
company has high confidence in the accuracy of demographic data?
Does “Interest Score” mean how much interest a user is paying
or how much he is interested in getting a new card? We were able
to resolve some of the ambiguities by reading several documents
published by data companies. For example, we resolved “High
Confidence” as implying data accuracy, but were unable to
disambiguate the meaning of “Interest Score.” Our participants also
had difficulty with comprehension. We consider this as another
usability issue in accessing profiles.

The geographic category was present in the Yahoo and BlueKai
profiles. Only the Yahoo profile contained the technical category.
All three profiles showed individual demographic data regarding
gender and age. However, the BlueKai profile contained additional
individual demographic data including marital status, education and
occupation. It also contained demographic data related to the user’s
household and work. The remaining categories appeared only in
the BlueKai profiles.

Note that if a profile from a company does not show a certain
category, it does not imply that the company does not have such
information; a company may choose not to show some of the
categories. Yahoo, for example, states on its Ad Interests Manager
page “In addition to the information shown here, Yahoo! may
use ... information provided by partners to help customize some
of the ads...” Further, “Yahoo! may combine information,
including personally identifiable information, that we have about
you with information we obtain from our trusted partners,” and
BlueKai is one of its trusted partners. In terms of improving
transparency by allowing users to see the data in their profiles,
BlueKai profiles are better than Google and Yahoo profiles because
they provide more detailed information.

5.2 Profile Contents 

We describe the seven categories of data types we found in actual
behavioral profiles. We provide a summary with examples in
Table 1. For demographic, geographic, technical and life event
categories, we describe all the data types we found. However, for
psychographic, behavioral and predictive categories, we found
many data types and, hence, we discuss
representative examples. Further, for each category, we contrast
what we found with the data that we may find if we examine
more profiles.

5.2.1 Demographic data 

Demographic data contains individual, household and firmographic
subcategories. Companies associate an individual’s full
name, full postal address, mobile number, email address and
email activity date with both demographic and other categories
discussed below [29].

Individual demographic: We found gender, age (e.g. 20-24
years), marital status, education level (e.g. Some College),
occupation (e.g. IT Professional), voter indicator, parent (e.g.
Declared Mom), home ownership (e.g. Home owner or Renter)
and languages. We found age, but companies also have date
of birth. In addition to voting, they have party affiliation
(e.g. Democrat) and political donor (e.g. Contribute conservative)
data. Other information includes religious affiliation
(e.g. Hindu), race/ethnicity (e.g. Arabs), family position (e.g.
Female head of household) and summarized credit statistics
including wealth rating (e.g. Decile), credit rating (e.g. High) and
net worth.

Household demographic: It includes details of an individual’s 
household. We found household income (e.g. $20K-$30K), 
household size (e.g. 1), number of adults (e.g. 1), children in 
household (e.g. No), home type (e.g. Multifamily Dwelling), 
median home value (e.g. $0-$100K), length of residence (e.g. 
Less than 3 years), discretionary spending (e.g. $40K-$50K) and 
auto (e.g. Less than $20K). In individual demographic, we did 
not find individual income, but when household size or number 
of adults is one, then household income implies an individual’s 
income. 
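The inference described above, that household income stands in for individual income in a single-person or single-adult household, can be written as a minimal sketch. The function name and argument names are hypothetical; the rule itself is the one stated in the text.

```python
from typing import Optional


def implied_individual_income(
    household_income: str,
    household_size: Optional[int],
    num_adults: Optional[int],
) -> Optional[str]:
    """Return the household income range as the individual's income when
    the household has one member or one adult; otherwise return None,
    because the household figure cannot be attributed to one person."""
    if household_size == 1 or num_adults == 1:
        return household_income
    return None
```

For the example values reported above (household income $20K-$30K, household size 1, one adult), the function returns the household range unchanged, which is why the absence of an explicit individual income field does not hide that information.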

For household demographic, companies have a rich set of additional
attributes. In addition to knowing presence of children in a
household, they know number of children, their gender and age,
which can be a range (e.g. 0-3 years, 4-7 years) or month, day
and year of birth [9] [4] [28]. They have indicators for the types of
persons in a household, for example, presence of smoker, veteran
in household, elderly parent in household [9]. Further, they have
data about mortgage and refinance (amount, term, loan type, rate
type).

Firmographic: It generally includes details about an individual’s
profession and affiliated organizations. We found type of
industry (e.g. College and University), number of employees
(e.g. 1-20 employees), and characteristics of the profession (e.g.
High Net Worth) and position (e.g. Technical Business Decision
Maker). Additionally, companies have data about sales revenue,
years of establishment (e.g. <2 years), domain expertise and
seniority.

5.2.2 Geographic data 

Geographic data includes location and neighborhood of a user.
For example, we found “US > Pennsylvania > Pittsburgh,”
“US>Massachusetts>Boston-Cambridge-Quincy” and
“Oceanside, California” for a participant who currently lived in
Pittsburgh and Boston, and had lived in Oceanside about five years
ago. The smallest granularity we found was at the city/county
level. However, companies have geographical data at the level of
full postal address, Zip code +4 (block level) and Zip code.
For example, one company from BlueKai Marketplace
claims to have 208 million postal addresses and 72 million
postal addresses linked to email addresses. Each postal
record is linked to a consumer’s demographic, interests and
behavioral data.

5.2.3 Technical data 

Technical data generally includes information related to users’
computers and devices used to access the Internet. We found
IP address (e.g. 71.182.182.9), operating system (e.g. Windows
7), browser (e.g. IE 10), color depth and screen resolution.
Interestingly, companies may use IP address to identify an anonymous
consumer visiting a website in real-time. For example,
they can map an IP address to a consumer’s full name, full postal
address, mobile number, purchases, interests and ~260 more
attributes [29]. They also use IP address to infer location, for
example, Yahoo states, “We use the IP address to infer your
location ...”

We did not find technical data regarding browser cookies and
online activities and interactions, for example, search history,
websites visited, articles read, comments, ratings and uploaded
files. However, companies collect such information to derive
psychographic, behavioral and predictive data. They use browser
cookies to identify a website visitor’s gender, presence of children
(Yes or No), age (e.g. 20-29) and household income (e.g.
75,000-99,999).

5.2.4 Predictive data 

Companies generally employ proprietary models and algorithms
that combine data from multiple public, proprietary and
self-reported sources, both online and offline, to make predictions
about users. Predictions can be made about behavior, attitude,
interest etc. For our predictive data category, we consider data
that indicates a user’s intent to purchase, usually in the near future.
We discuss other types of predictions as part of other categories
discussed below.

We found examples that predicted purchases related to credit
card, personal health, higher education, computers, cell phones,
auto insurance, flying, hotels etc. For example, “Credit Card
App Intent Score - 10-11%” indicates intent to apply for a credit
card. “Personal Health - Values 70-90%” indicates future purchase
propensity regarding personal health products; “In-Market
- Cell-Phones and Plans” and “In-Market - US Domestic Flyers”
indicate that the user is currently shopping for cellphone plans
and flights; “Auto insurance online buyer - High Propensity” and
“Online Higher Education Enrollee - High Propensity” indicate
users looking to buy insurance or enroll in courses. Companies
have in-market data for many other areas including real estate,
apartments and automotive purchases.

5.2.5 Psychographic data 

Psychographic data generally includes interests and attitudes of
a user. We found interests related to health (e.g. Bones, Joints,
Muscles > Pain, Weight Conscious Code - Value Tiers 1-3),
religion (e.g. Interest in Religion Code - Value Tiers 1-3, Christian
Music Code - Value Tiers 1-3), travel (e.g. Destinations >
New York, Vacation Packages), automotive (e.g. Coupe),
sweepstakes, news (e.g. News and Politics > Government) etc.
Companies possess additional data including gambling, lottery,
alcohol and tobacco.

Profiles can include data on attitudes and values of users.
Companies can use that information to trigger a desired response
from users. Some of the attitudes we found are as follows. “Buy
American - Most Likely,” which may indicate relatively high
importance of pride in decision making. “Work Hard, Play Hard -
Not Likely,” which may indicate users’ desire to be at the forefront
of both their career and outside relative to their peers. “Stop
and Smell the Roses - Most Likely,” which may indicate a belief
in altruism.


Table 1: Examples of Data Types Found in User Profiles

Category       Subcategory   Examples
-------------  ------------  ----------------------------------------------
Demographic    Individual    Female
                             Single
                             20-24 years
                             Some College
                             IT Professional
                             Voter
               Household     Income Range - $20K-$30K
                             Household Size - 1
                             Children in Residence - No
                             Home Value - $0-$100K
                             Length of Residence - Less than 3 years
                             Auto - less than $20K
               Firmographic  Business Data > Micro (1-20 employees)
                             Business Data > Software
Geographic                   US > Pennsylvania > Pittsburgh
                             Oceanside, California
Technical                    IP address - 71.182.182.9
                             OS - Win7
                             Browser - IE 10
                             Screen resolution - 1067X667
Predictive                   Credit Card Interest Score - 16-17%
                             Credit Card App Intent Score - 10-11%
                             Auto insurance online buyer - High Propensity
                             In-Market - Cell-Phones and Plans
Psychographic  Interests     Health > Bones, Joints, Muscles > Pain
                             Interest in Religion - Value Tiers 1-3
                             Sweepstakes - Value Tiers 1-3
                             Weight Conscious Code - Value Tiers 1-3
                             Travel Destinations > New York
               Attitudes     Buy American - Not Likely
                             Show me the Money - Most Likely
Behavior       Activities    OTC Medicine > Pain Reliever
                             Gastrointestinal - Tablets
                             Offline CPG Purchasers > Brand > Hebrew National
                             Charmin Ultra Soft
                             Past purchase > ISP > Internet > Verizon
               Lifestyle     Green Living
                             Owns a Regular Amex Card
                             Eco Friendly Vehicle Owner
                             Premium Channel Viewer
Life Event                   Empty Nesters


Figure 3: Listing of Consumer Packaged Goods in a profile


5.2.6 Behavioral data 

Behavioral data contains data related to users’ lifestyle, activities
and personality. For example, the entry, “Green Living,” found
in one of the profiles indicates that the user exhibits an
environmentally friendly lifestyle. Companies can further differentiate
between users who act and those who only think (e.g. Behavioral
Greens vs. Think Green), and between those who are undecided and
those who are against (e.g. Potential Green vs. True Brown) [9]. The
profile containing “Green Living” also contained “Eco Friendly
Vehicle Owner.” Other lifestyle aspects we found include credit
(e.g. Owns a Regular Amex Card), finance (e.g. Owns Mutual
Funds), shopping (e.g. Discount Shopper) and travel (e.g. Theme
Park Visitor).

In activities, we consider past purchases, both offline, such as
at stores and pharmacies, and online. One participant had
over-the-counter medications (e.g. OTC Medicine > Pain Reliever,
OTC Medicine > Cough and Cold) purchased at a local pharmacy
listed in her profile. In addition to OTC, companies have
information about medications (e.g. oral contraceptive, Lipitor,
Insulin) purchased by users and any ailments they may have (e.g.
Alzheimer’s, clinical depression, Diabetes-2).

Another participant had a list of ~300 past consumer packaged
goods (CPG) purchases in his profile. We list part of his
profile in Fig. 3. CPG entries include brands (e.g. Hebrew
National, Haagen-Dazs) and items (e.g. Charmin Ultra Soft,
Gastrointestinal - Tablets, General Mills > Fiber One). Companies
have other data such as purchase of alcohol and tobacco.
Lastly, companies have built models to predict an individual’s
personality type (e.g. introvert, leader). They have assigned
personality types, by name and postal address, to 85% of the US
adult population.


5.2.7 Life event data 

Life event data indicates certain events in a user’s life that may
lead to changes in behavior and/or create specific needs. We
found “Empty Nester,” which may indicate that the user’s children
have left for college. Other life events that companies focus
on include new movers and new parents.


6 User Concerns 

As part of the interviews, we asked participants (n=8) about their 
concerns and surprises regarding the data in their profiles. Below, 
we discuss their concerns and surprises. Note that participants 
viewed their own profiles, which varied among participants. 

Collection of sensitive data: Participants expressed surprise
and/or concern about credit and health information. One participant
was surprised by credit information “Credit Card App Intent
Score - 10-11%” and “Credit Card Interest Score - 16-17%.” He
was concerned because he did not understand the meaning or
implication of the credit information in his profile. A participant
who found an over-the-counter medication “OTC Medicine
> Pain Reliever” said that it scared her. She had recently purchased
pain medications from a pharmacy for an injury. Another
participant who had “General health > bones, joints, muscles >
pain” considered the data confidential and did not want it to be in
his profile. As a result of an injury suffered during an accident, the
participant was in pain for a prolonged time. He had not shared
the details with other people. In his opinion, extracting this
information from a few online searches and reflecting it in his profile
was akin to sharing the information with others.

Combining data and extent of collection: One participant
who had an extensive profile with ~570 items was surprised
and concerned by the amount of data gathered. The participant’s
profile contained demographic data (e.g. age, gender, household
income), location, past purchases including a comprehensive list of
~300 offline consumer goods purchases etc. The participant was
surprised about how all the information was obtained without his
knowledge or consent. Further, he was concerned to see his data
from multiple sources being combined. He explained that it is
okay for individual companies to have data about his business
with these companies, for example, a cellphone company knowing
about cell phone plans, or a pharmacy knowing about consumer
goods purchases. However, a third party combining data from
multiple sources and building profiles was not okay to him. The
participant mentioned that it was not clear how all this would
affect him.

Granularity of data: For some data types, the concern was
regarding the granularity or level of detail. One participant was
okay with broad interest categories, but not with specific categories.
For example, he was not concerned to see “Web Services”
listed under interests. However, he would be concerned if
a specific instance such as “Pirate Bay” was listed. Another participant
was concerned about the retention period. He
pointed out that a health condition listed in his profile was more
than five years old. The participant had forgotten about it, but
the information was still present in his profile. The participant’s
concern is similar to the “right to be forgotten” argument [22].

Data use: Participants were concerned about how the data in 
their profiles may be used. One of the participants, who had 
credit scores listed in his profile, was concerned about its impli¬ 
cations. One more participant expressed similar sentiment when 
he said it was not clear how the extensive collection and combining
of data would affect him. Both participants were indirectly,
if not directly, thinking about the purposes for which
the data may be used. Another participant was more direct: he 
felt that data can be used to infer actions performed by the user. 
He was concerned that by combining different interests, for ex¬ 
ample, Pirate Bay and Movies, one could conclude that he had 
downloaded movies illegally. 

Accuracy of data: All profiles had errors to varying degrees, 
and errors were found among all data types. In general, partic¬ 
ipants were not concerned when the data was incorrect. A par¬ 
ticipant even stated that he was happy that there were so many 
errors. Participants, however, became concerned when the data 
in the profile was correct. For example, one participant initially 
found many entries regarding credit and income, but was not 
concerned. This was because the entries consistently, but erroneously,
stated that the participant was affluent with 350000+
income, had top 1% credit and owned an American Express card.
However, after seeing an OTC medication entry that was correct, 
the participant said, “Now I am scared.” Later, this participant 
hypothesized that companies added incorrect data to profiles so 
users would not worry too much. A participant expressed con¬ 
cern when only two out of twelve entries regarding professional 
interests were correct. One reason that contributed to user con¬ 
cern was the level of detail or specificity of the correct entries. 
Only one participant pointed out that he would be concerned 
about incorrect data if it was used to make adverse decisions 
about him. This is interesting as it highlights the importance of 
accuracy in behavioral profiles as perceived by users. 

Editing profile data: In general, participants did not try to cor¬ 
rect erroneous data in their profiles. Two participants said that 
correcting the data would enable companies to track them further.
A second reason was that the implications of editing data
were not clear. One of the participants asked “What does edit
mean? Is the data deleted from all sources?” However, we hy¬ 
pothesize that users may want to correct the data in their profiles 
if erroneous data may lead to decisions that adversely impact 
them. For example, a user interested in loans may want to cor¬ 
rect mortgage amount, credit card score or median bankruptcy 
score if she believes that a loan company may use that data. 

7 Online Survey 

We conducted an online survey (n=100) to validate the identified 
concerns with a larger and more diverse population. This survey 
had two purposes. First, we wanted to confirm whether a more 
diverse population of users agreed with the concerns that we had 
identified from the interviews. Second, we wanted to identify po¬ 
tential additional user concerns and data types that may not have 
been observed in the interviews. We recruited survey participants 
from Amazon Mechanical Turk crowd-sourcing platform 13 . We 
provide the survey questionnaire in Appendix A.

7.1 Survey Design 

To understand participant demographic, we asked them their age, 
gender, primary occupation and education level. To understand 
their technical background, we asked them whether they had a 


college degree or work experience in computer science, soft¬ 
ware development, web development or similar computer-related 
fields. We also asked them how much they liked personalization 
of ads on websites. We gathered information on demographic, 
background and liking for personalization as they may affect par¬ 
ticipant concerns. We also used demographic data to analyze di¬ 
versity of our participant population. 

We used the sample profile shown in Fig. 4 to understand
whether the survey participants agreed with the concerns that we 
identified from the interviews. We used the sample profile to 
understand their concerns regarding collection of sensitive data, 
amount of data, combining data from multiple sources, level of 
detail and data use. We felt that survey participants could not 
provide meaningful answers regarding concerns of accuracy of 
information and editing profile data based only on a sample pro¬ 
file. Hence, we did not ask them about those concerns. 

We created the sample profile using data from profiles of the 
interview participants. To understand concerns about sensitive 
data collection, we added items related to credit (Credit Card In¬ 
terest Score 8-9%) and health (Personal Health - Values 70-90%) 
both of which our interview participants had found sensitive. We 
also added entries related to religion (Interest in Religion Code - 
Value Tiers 1-3), individual demographic (Female and Declared 
Mom) and household demographic (Income Range $75K-$99K). 
To address the concern on amount of data, we ensured that the 
profile had data items from several categories: demographic, 
psychographic, behavior and predictive. Geographic category 
was represented by the “Location and Neighborhood” tab. To 
show data being combined from multiple sources, we added an 
offline CPG purchase (Offline CPG Purchasers > Vicks). To 
cover the concern about level of detail, we picked items that were
very specific, such as “Interest > Video Games > Sony > PlayStation 3.”
Further, the predictive values such as “Values Tiers 1-3” also in¬ 
creased the specificity of items. Lastly, we felt that it would be 
more realistic to show the data items as they appeared in actual 
profiles; a user looking at her actual profile would not have ad¬ 
ditional explanations or links to documents that could clarify her 
ambiguities. 

Before showing the sample profile, we explained to the par¬ 
ticipants that advertisers collected data about them in order to 
personalize ads. Further, advertisers may create profiles about 
them using the collected data. We then showed them a sample 
profile (Fig. 4). To check whether participants were paying attention,
we asked them to select, from a list of six items, at least
two items present in the sample profile. We then asked the par¬ 
ticipants to rate, on a 5-point Likert scale of “Strongly disagree” 
to “Strongly agree,” how much they agreed or disagreed with the 
following list of concerns. We randomized the order in which the 
concern statements were displayed. 

1. I am concerned because I believe that the profile contains 
sensitive data 

2. I am concerned by the amount of data in the profile 

3. I am concerned because my data from multiple sources (e.g. 
online activities, in-store, other companies) is being com¬ 
bined 

4. I am concerned by the level of detail (e.g. specific information,
not just broad categories) in the profile

5. I am concerned about how my data may be used

Figure 4: Sample profile used in online survey

Figure 5: Percentage (x-axis) of survey participants (n=100) who
agreed with indicated concerns (y-axis). Concerns: contains sensitive
data, amount of data, data from multiple sources combined, level of
detail, how my data may be used. Legend: Strongly disagree, Disagree,
Neither, Agree, Strongly agree.

After the participants rated the concerns, we asked them, using
an open-ended question, whether they had any other concerns regarding
the sample profile. We also asked them, using a 5-point
Likert scale, if their liking for personalization had decreased after
seeing the types of data collected for personalization. We
were interested in knowing if awareness of behavioral profiles
can change participants’ opinions.

Since we could not address, with a sample profile, concerns
regarding accuracy of information and editing profile data, we
gave participants the option of looking at their own profiles. We
made this step optional, to know whether participants were really
interested in looking at their own profiles. We stated that their
payment and bonus were not affected if they chose not to look
at their profiles. For participants who chose to look at their own
profiles, we provided instructions to access BlueKai, Google and
Yahoo profiles. We then gave these participants an option to describe
their reactions. This also helped us identify any additional
concerns or data types. Lastly, we asked all participants if they
had any further comments.

7.2 Participant Background

We recruited participants (n=100) from the Amazon Mechanical
Turk crowd-sourcing platform. Our participants were at least
18 years of age and located in the United States. We used the Mechanical
Turk location feature to ensure that users were from the
United States. We collected informed consent from our participants.
We offered a payment of $0.50 for completing the survey
and a $0.30 bonus for following the survey instructions correctly.
We implemented our survey on the Survey Gizmo platform, and
redirected participants from Mechanical Turk to Survey Gizmo.

The average age of the participants was 27.74 years (SD =
7.57) and the median was 26 years. The male to female ratio was
four to one. Thirty-seven participants had completed a four-year
bachelor’s degree or higher. Twenty-five participants had
a college degree or work experience in computer science, software
development, web development or similar computer-related
fields. Twenty-five participants were students, and the rest had diverse
occupations including administration, art, business, education,
engineering, law enforcement, service, skilled labor and homemaker.
Our survey participant pool was more diverse than our
interview participant pool, especially in education level, occupation
and technical background. Thirty-six participants agreed (8
strongly agree, 28 agree) that they liked personalization of ads,
and 39 disagreed (12 strongly disagree, 27 disagree). Hence, the
pool was balanced in its opinion of personalization.

7.3 Survey Results 

Out of the 100 participants who completed the survey, 97 cor¬ 
rectly answered the attention check question, that is, they se¬ 
lected at least two items present in the sample profile. Two par¬ 
ticipants selected one item not present in the profile, and one 
selected two items not present in the profile. 
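
The attention check described above can be scored mechanically. The following is a minimal sketch in Python, not the scoring procedure used in the study. It assumes, based on the sample profile description in Section 7.1 and the option list in Appendix A, that “Male” and “Offline CPG Purchasers > Charmin Ultra Strong” are the two options absent from the sample profile (the profile listed “Female” and “Vicks” instead). The names PRESENT, ABSENT and passes_attention_check, and the rule that any absent selection fails the check, are illustrative assumptions.

```python
# Options offered in the attention check (Appendix A, question 8).
# Assumption: these four appear in the sample profile.
PRESENT = {
    "Credit Card Interest Score 8-9%",
    "Personal Health (Values 70-90%)",
    "Interest in Religion Code (Value Tiers 1-3)",
    "Household Income (HHI) > Income Range $75,000 - $99,000",
}
# Assumption: these two are decoys that do not appear in the sample profile.
ABSENT = {
    "Male",
    "Offline CPG Purchasers > Charmin Ultra Strong",
}


def passes_attention_check(selected):
    """Return True if at least two selected items are in the sample profile
    and no selected item is absent from it (assumed scoring rule)."""
    chosen = set(selected)
    unknown = chosen - PRESENT - ABSENT
    if unknown:
        raise ValueError(f"unrecognized options: {sorted(unknown)}")
    return len(chosen & PRESENT) >= 2 and not (chosen & ABSENT)
```

Under this rule, a participant who picks two present items passes, while one who also picks a decoy fails, which matches the three failures described above.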

In Fig. 5, we show how survey participants (n=100) rated concerns
regarding collection of sensitive data, amount of data,
combining data from multiple sources, level of detail and data use.
For each of the five concerns, at least 70% of the participants ei¬ 
ther agreed or strongly agreed that they were concerned. Partic¬ 
ipants were most concerned about how their data may be used 
(85%), followed by level of detail (77%), aggregation (75%), 
amount of data (73%) and collection of sensitive data (73%). 
Using a MANOVA, we found that the differences among user 
concerns were significant (F[4,96] = 3.9, p < .05). At least 70%
agreement on each concern assures us that a more diverse popula¬ 
tion agrees with concerns that we identified from our interviews. 

We analyzed participant comments for additional concerns. 
The majority of the participants did not express new concerns. Seven
participants were concerned about the security of their data; they 
worried that their data could be abused by hackers, criminals and 
identity thieves. Four participants expressed concerns that their 
data could be shared or sold to third parties, and accessed by the 
government. These are important and should be explored further. 

Fifty participants agreed (17 strongly agree, 33 agree) that 
their liking for personalization had decreased after seeing the 
types of data collected for personalization, and 23 disagreed (2 
strongly disagree, 21 disagree). Interestingly, 18 of those 50
participants had earlier said that they liked personalization of ads.
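
The agree and disagree totals reported for the two personalization statements can be reproduced by collapsing the 5-point Likert counts. The following is a small sketch in Python using only the counts stated in the text. The function name summarize is hypothetical, and the “neither” count is inferred under the assumption that all 100 participants answered each statement; it is not reported directly.

```python
from collections import Counter


def summarize(counts, n):
    """Collapse 5-point Likert counts into agree / disagree / neither totals."""
    c = Counter(counts)
    agree = c["Agree"] + c["Strongly agree"]
    disagree = c["Disagree"] + c["Strongly disagree"]
    # Inferred remainder; assumes every one of the n participants answered.
    neither = n - agree - disagree
    return {"agree": agree, "disagree": disagree, "neither": neither}


# "My liking for personalized ads on websites has decreased" (n=100).
decreased_liking = summarize(
    {"Strongly agree": 17, "Agree": 33, "Strongly disagree": 2, "Disagree": 21},
    n=100,
)

# "I like personalization of ads on websites" (n=100).
liked_personalization = summarize(
    {"Strongly agree": 8, "Agree": 28, "Strongly disagree": 12, "Disagree": 27},
    n=100,
)
```

Collapsing the scale this way yields the 50 versus 23 and 36 versus 39 splits quoted in the text.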

Seventy-one participants (71%) chose to look at their own profiles
even though it was optional. This indicates that people are interested
in learning about their behavioral profiles. This may also
indicate that many people are unaware of profile access mecha¬ 
nisms provided by companies. This is similar to our interview 
pool where only one out of eight participants was aware of pro¬ 
file access mechanisms. 

Out of 71 participants, 51 (72%) chose to report their reactions.
We analyzed their comments for concerns regarding accuracy
and editing profile data. Nine participants (17%) reported
empty profiles. Twenty-three participants (45%) reported inaccuracies,
and only three participants (6%) reported that they found
accurate profiles. Participants’ reactions to inaccuracies included
“blatantly incorrect,” “80% inaccurate,” “somewhat dated” and 
“hilariously overestimated.” Recall that all our interview par¬ 
ticipants had also found varying levels of inaccuracies in their 
profiles. Most of the participants who reported inaccuracies and 
empty profiles explained that they felt relieved and less con¬ 
cerned about data collection. Only two participants (4%) felt 
that inaccuracies in their profiles could adversely affect them. 
Three participants mentioned editing data. One of them
corrected errors, and two of them deleted correct entries. Reac¬ 
tions of survey participants regarding inaccuracies in profiles and 
editing profile data appear similar to those of interview partici¬ 
pants. 
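
The percentages in this subsection use different denominators: all 100 survey participants, the 71 who viewed their own profiles, and the 51 who reported reactions. The sketch below, in Python, makes those denominators explicit using only counts stated in the text. The helper share and the constant names are hypothetical.

```python
def share(part, whole):
    """Percentage of `whole` represented by `part`."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


SURVEYED = 100              # completed the survey
VIEWED_OWN_PROFILE = 71     # chose the optional step
REPORTED_REACTIONS = 51     # described what they found
REPORTED_INACCURACIES = 23  # said their profile had errors

viewed_pct = share(VIEWED_OWN_PROFILE, SURVEYED)
reported_pct = share(REPORTED_REACTIONS, VIEWED_OWN_PROFILE)
# The 45% figure is relative to those who reported reactions ...
inaccurate_pct = share(REPORTED_INACCURACIES, REPORTED_REACTIONS)
# ... which is a smaller share of everyone surveyed.
inaccurate_of_all_pct = share(REPORTED_INACCURACIES, SURVEYED)
```

Because reporting reactions was optional, the share computed over the 51 respondents should not be read as a rate for the whole survey population.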

During analysis of participant reactions, we did not find any 
new data types. Lastly, we looked for comments that signaled 
difficulty in comprehending profile information. One partici¬ 
pant explicitly reported not being able to understand parts of his 
BlueKai profile. Two participants thought “High/Medium Confi¬ 
dence” was referring to their personality. Some of our interview 
participants had similar difficulties. Overall, our survey results 
confirm the results from our interviews. 

8 Limitations 

We studied profile contents from a relatively small number of profiles
(n=8). We looked at behavioral profiles from three data
aggregators, and all of them were cookie-based profiles. If we
study a larger number of profiles, profiles from other companies, or
server-based profiles, we may find other types of data. We conducted
in-person interviews with graduate students (n=8) with
science and engineering backgrounds. For our online survey, we
recruited participants (n=100) from Amazon Mechanical Turk, 
and they may have more technical knowledge than an average 
person. Further, our online survey results may contain self¬ 
selection bias. By recruiting participants from a more diverse 
pool, we may identify new concerns and surprises. Lastly, we 
can improve estimation of profile accuracy by asking participants 
to verify information on all entries in their profiles. 

9 Discussion and Conclusions 

Below we discuss insights from our study. We conclude with 
a discussion of implications for public policy management and 
directions for future research. 

User concerns are justified: Our study shows that participants 
have several concerns about behavioral profiles including extent 
of collection, collection of sensitive and confidential data, and 
level of detail. Our interview participants considered health and 
credit data sensitive. Profiles contained other data such as reli¬ 
gion and income types, which a more diverse audience may find 
sensitive. Further, our analysis of data aggregator documents 
shows that they have much more intrusive data including fully 
identifying data such as first and last names, and complete postal 
addresses. This can further exacerbate user concerns. 


Clarifying data usage is essential: The biggest user concern 
was how their data may be used. Use of profile data is not clear. 
Given the variety of data present in the profiles, its uses seem 
limitless. Data could be used for personalization, development of 
better products, or fraud detection. It could also be used for hir¬ 
ing decisions, discreet background checks or proselytizing. For 
a user, the impact of using her data for the former could be quite 
different from that of the latter. 

An important underlying issue is what inferences are permis¬ 
sible. The richness of profile data allows one to draw all kinds 
of inferences about a user. If a user liked race cars on Facebook, 
is he likely to speed? If a user bought OTC pain medications
frequently, is she addicted to pain medications? Is a user who
purchases the Lean Cuisine brand healthier than a user who
purchases the Häagen-Dazs brand? Is a user who regularly buys the
Hebrew National brand Jewish?

Claims of anonymity of profiles are misleading: Companies 
overlay anonymous data such as financial records with identify¬ 
ing information obtained from public, online and offline sources. 
This action of combining information from multiple sources not 
only creates a rich, 360-degree view of all aspects of life, but also 
associates it with a specific individual, her name, address and 
other personal information. Statements that imply that profile 
data are anonymous or pseudonymous, for example “Consumers 
can also control their anonymous profile G],” are misleading. 

Accuracy of profiles is poor: Our study shows that a large 
number of behavioral profiles contain inaccuracies. All inter¬ 
view participants (8/8) and 45% (23/51) of survey participants, 
who provided feedback about their profiles, reported errors. This 
violates an important fair information practice principle: the data 
quality principle. Although companies seem to be verifying the 
accuracy of the data that they obtain a, it is not clear how ef¬ 
fective their processes are. Since data is being combined from 
multiple companies, a few companies taking steps to ensure cor¬ 
rectness may not be sufficient. 

Some companies claim that their sources are accurate as they 
are “self-reported” by users and not modeled or predicted. The 
correctness of these self-reported sources is questionable. Users
may be taking surveys or registering without being aware of implications
in a different context. In fact, research has shown that
people deliberately provide fake data as a way of protecting their
privacy online [20]. There are many other ways in which errors
may be introduced: sharing a store loyalty card with another
shopper who forgot her card, browsing from a friend’s account, 
or purchasing items for your employer. 

It is also important to consider the accuracy of predictive data. 
It is debatable how accurate the results are when a company pre¬ 
dicts religious affiliation, country of origin, ethnicity and lan¬ 
guages spoken, based on an individual’s name 0. Further, de¬ 
sired level of accuracy would depend on the type of data (likeli¬ 
hood of buying toilet paper vs. median bankruptcy score) and its 
potential uses (advertising vs. hiring). 

Interestingly, users were generally not concerned to see in¬ 
accuracies. Many felt relieved and did not want to correct the 
errors. Users appeared to be thinking mainly about compa¬ 
nies tracking them, and having incorrect information about users 
seemed to defeat that purpose. However, users also worried over 
how their data may be used. Decisions based on erroneous data,
for example, fraud detection based on incorrect purchases or job
screening using an incorrect personality type, may adversely impact
users. Hence, we hypothesize that users will start caring about
inaccuracies as they become more aware of their implications.

Effect of editing/deleting profile data is unclear: Some study 
participants deleted data from their profiles to ensure that compa¬ 
nies no longer have data about them. Are edit mechanisms meet¬ 
ing this expectation? There are several questions about the effect 
of editing or deleting profile data. Do all companies that possess 
a user’s data honor a user’s request? For example, a BlueKai profile
shows data that its affiliates may have about the user. Does
deleting data from a BlueKai profile guarantee that the data is
deleted from its affiliates’ databases? When a user corrects an
erroneous entry in a profile, is that information propagated to 
companies that acquired the profile data? We need to clarify the 
implications of edit and delete. Otherwise, they only provide a 
false sense of comfort to users. 

Transparency provided by access is insufficient: From our 
study, we believe that providing access to user behavioral pro¬ 
files is a step in the right direction; it improves transparency of 
data practices. However, the information provided via these ac¬ 
cess mechanisms is incomplete and insufficient. First, our study 
shows a large gap between the types of data companies show 
in user profiles and data that they actually possess about users. 
For example, profiles show age, but companies also have date of 
birth; profiles show city, but companies also have Zip, Zip+4 and
postal addresses; and companies state profiles are anonymous, 
but they have full names. Second, some companies that provide 
access are more transparent than others, for example, BlueKai vs. 
Yahoo or Google. Lastly, profiles show information about data 
types, but not about how and when they were acquired or in¬ 
ferred. Further, they do not show information such as frequency 
of purchase. These details are important to meet the goal of im¬ 
proving transparency into company data practices. 

Usability of access mechanism needs improvement: Our
participants had difficulty in comprehending profile data. For
example, a participant asked “What does MOB/branded data 
mean?” Another misunderstood the meaning of “High Confi¬ 
dence.” To understand the meaning of these and many other en¬ 
tries, we had to read many documents. There is a need to im¬ 
prove comprehensibility of profiles. Accessing profile data was 
not easy; it was not possible to download profile data for easier 
analysis. For example, each BlueKai page had only five entries 
and a user with 99 pages had to click on each page to see its 
contents. 

Implications for public policy management: Users would 
benefit if companies that create behavioral profiles provide bet¬ 
ter notice about collection, combining and potential uses of user 
data. Improving awareness of access mechanisms among users 
can also help users. At present, there seems to be little aware¬ 
ness, for example, only one out of eight interview participants 
knew about access mechanisms. Users would benefit if companies
get users’ consent before combining data from different contexts.
To alleviate users’ concerns regarding data use, companies
could disclose the purposes for which they use profile data. Further,
they could specify what inferences they draw and how their
prediction models work. From a user’s perspective, stating that 
the company uses proprietary models, for example, “developed 
a proprietary algorithm that utilizes a consumers name, mailing 
address and 320 different data points to accurately assign a per¬ 
sonality type to 85% of US adult consumers lH,” may be in¬ 
sufficient. To address user concerns regarding level of detail of 
profile data, companies could explain the need for such level of 
detail. Lastly, users would benefit if companies ensure accuracy 
in profile data and address the issue of accountability for adverse 
impact arising from errors in profiles. 

Directions for future research: We need user studies to fur¬ 
ther evaluate usability of profile access mechanisms. Results 
from our study can inform research in related areas such as on¬ 
line behavioral advertising (OBA) company data practices and 
privacy notices. For example, studies that evaluate whether ac¬ 
cess matters to users M could use realistic behavioral profiles 
from our study; research on making privacy policies more usable
for users [23] could extract and highlight, from a privacy
policy, parts that are of concern to a user. Research on tracking
has largely focused on tracking based on online Internet activities
iniiBiiniiii]. However, behavioral profiles contain data
from multiple sources including offline sources, and these could 
be investigated.

A Online Survey Questionnaire

[Consent instructions here] 

Important: Please think thoroughly before answering each 
question. Your precise responses are very important for us. We 
are not interested in what someone else thinks - we want to know 
what you think! You may give an incomplete answer or say you 
do not know.

1) We are interested in understanding how you experience things
online. We will start by seeking your views about website
advertising. Here, “website advertising” refers to ads that are
displayed on the web pages that you visit. In a sentence or two,
please tell us what you think about website advertising.

2) What is your age (in years)? 

3) What is your gender? 

() Male () Female () Decline to answer 

4) Which of the following best describes your primary occupa¬ 
tion? 

[List of options here] 

() Other (Please specify):

() Decline to answer 

5) Which of the following best describes your highest achieved 
education level? 

() No high school 
() Some high school 
() High school graduate 
() Some college - no degree 
() Associates 2 year degree 
() Bachelors 4 year degree 

() Graduate degree - Masters, PhD, professional, medicine, etc. 

() Decline to answer 

6) Do you have a college degree or work experience in computer 
science, software development, web development or similar 
computer-related fields? 

() Yes () No () Decline to answer 

7) Advertisers can personalize ads on websites to ensure that the 
ads are relevant to you. 

Please indicate how much you agree or disagree with the 
following statement. 

I like personalization of ads on websites. 

() Strongly disagree () Disagree () Neither agree nor disagree
() Agree () Strongly agree

Advertisers collect data about you in order to personalize ads. 
Advertisers may create profiles about you using the collected 
data. 

The following is an image of a profile that shows the different 
types of data that advertisers collect about users like you. Please 
look through the entire image at your own pace, and then answer 
the following questions. 

[Sample profile (Fig. 4) here]

8) Please select from the list below at least two items that appear 
in the sample profile. 

[ ] Male 

[ ] Credit Card Interest Score 8-9% 

[ ] Offline CPG Purchasers > Charmin Ultra Strong 
[ ] Personal Health (Values 70-90%) 

[ ] Interest in Religion Code (Value Tiers 1-3) 

[ ] Household Income (HHI) > Income Range $75,000 - $99,000 


[Randomize Q9 - Q13] 

We will ask you some questions to understand your reaction to 
the profile you just saw. It is important that you have looked 
at the different types of data in the profile before continuing. 
Please click next when you are ready. 

9) Please describe your reaction to the profile. [Fig. 4 shown
here]

Indicate how much you agree or disagree with the following
statement.

I am concerned because I believe that the profile contains
sensitive data.

() Strongly disagree () Disagree () Neither agree nor disagree
() Agree () Strongly agree

10) Please describe your reaction to the profile. [Fig. 4 shown
here]

Indicate how much you agree or disagree with the following
statement.

I am concerned by the amount of data in the profile.

() Strongly disagree () Disagree () Neither agree nor disagree
() Agree () Strongly agree

11) Please describe your reaction to the profile. [Fig. 4 shown
here]

Indicate how much you agree or disagree with the following
statement.

I am concerned because my data from multiple sources (e.g.
online activities, in-store, other companies) is being combined.

() Strongly disagree () Disagree () Neither agree nor disagree
() Agree () Strongly agree

12) Please describe your reaction to the profile. [Fig. 4 shown
here]

Indicate how much you agree or disagree with the following
statement.

I am concerned by the level of detail (e.g. specific information,
not just broad categories) in the profile.

() Strongly disagree () Disagree () Neither agree nor disagree
() Agree () Strongly agree

13) Please describe your reaction to the profile. [Fig. 4 shown
here]

Indicate how much you agree or disagree with the following
statement.

I am concerned about how my data may be used.

() Strongly disagree () Disagree () Neither agree nor disagree
() Agree () Strongly agree

14) Please explain if you have other concerns about the profile.

You are almost done.

We will ask you how you feel about personalized ads after 
seeing the profile. We will also give you a chance to look at 
your own profile. Please note that looking at your own profile is 
optional. 

15) Please indicate how much you agree or disagree with the 
following statement. 


After seeing the types of data collected for personalization, my 
liking for personalized ads on websites has decreased. 

() Strongly disagree () Disagree () Neither agree nor disagree
() Agree () Strongly agree

You looked at a sample profile. Would you like to look at your 
own profile and learn what data advertisers have about you? 

Please note that this is optional. Your payment and bonus will 
not be affected if you choose to skip looking at your own profile. 
However, what you learn may be beneficial to you. 

16) Would you like to look at your own profile? 

() Yes () No 

Thank you for choosing to look at your own profile. We believe 
it will be beneficial to you and us. 

Please copy and paste the following website link in a new tab 
or window to access your own profile. You should see a profile 
similar in appearance to the sample profile.

http://bluekai.com/registry/


Please note that the profile may not display properly if you have 
disabled browser cookies. You can try from a different browser 
if you have more than one browser installed. 

If you are not able to access your profile using the above link,
you can alternatively try the following websites.

https://aboutthedata.com/ (scroll to the bottom of the
page to click on “See and Edit Marketing Data about You.”)

http://www.google.com/settings/ads

http://info.yahoo.com/privacy/us/yahoo/opt_out/targeting/

17) Please tell us briefly what you found in your own profile and 
how you feel about it (optional but helpful for our research). 

18) Do you have any further comments? 

Thank you for taking our survey. Your response is important 
to us. Below is your confirmation code. You must retain this 
code to be paid - it is recommended that you store your code in 
a safe place (either by writing it down, or by printing this page). 


